Extracting Social Networks and Biographical Facts From Conversational Speech Transcripts
Abstract
We present a general framework for automatically extracting social networks and biographical facts from conversational speech. Our approach relies on fusing the output produced by multiple information extraction modules, including entity recognition and detection, relation detection, and event detection modules. We describe the specific features and algorithmic refinements effective for conversational speech. Cumulatively, these raise the performance of social network extraction from 0.06 to 0.30 on the development set and from 0.06 to 0.28 on the test set, as measured by F-measure on the ties within a network. The same framework can be applied to other genres of text: we have built an automatic biography generation system for general-domain text using the same approach.
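To make the evaluation metric concrete, the following is a minimal sketch of an F-measure computed over the ties of a network. It is not the authors' scoring code: the function name `tie_f_measure`, the treatment of ties as undirected and untyped pairs of names, and the use of the balanced (F1) variant are all assumptions made for illustration.

```python
def tie_f_measure(predicted, gold):
    """Balanced F-measure over network ties.

    Assumption: each tie is an undirected, untyped pair of person names.
    The abstract does not specify tie direction, tie types, or the F-beta
    weighting, so this is only an illustrative sketch.
    """
    # frozenset makes ("Ann", "Bob") and ("Bob", "Ann") the same tie.
    pred = {frozenset(tie) for tie in predicted}
    ref = {frozenset(tie) for tie in gold}
    if not pred or not ref:
        return 0.0

    correct = len(pred & ref)
    if correct == 0:
        return 0.0

    precision = correct / len(pred)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data, not taken from the paper.
predicted_ties = [("Ann", "Bob"), ("Bob", "Cara")]
gold_ties = [("Bob", "Ann"), ("Ann", "Dan")]
score = tie_f_measure(predicted_ties, gold_ties)
```

In this toy case one of the two predicted ties matches one of the two gold ties, so precision and recall are both 0.5 and the F-measure is 0.5.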
Similar resources
Ethnomethodology and Conversational Analysis
In a speech community, people utilize their communicative competence which they have acquired from their society as part of their distinctive sociolinguistic identity. They negotiate and share meanings, because they have commonsense knowledge about the world, and have universal practical reasoning. Their commonsense knowledge is embodied in their language. Thus, not only does social life depend...
FROntIER: A Framework for Extracting and Organizing Biographical Facts in Historical Documents
The tasks of entity recognition through ontological commitment, fact extraction and organization in conformance to a target schema, and entity deduplication have all been examined in recent years, and systems exist that can perform each individual task. A framework combining all these tasks, however, is still needed to accomplish the goal of automatically extracting and organizing biographical ...
Lesion correlates of conversational speech production deficits.
We assess brain areas involved in speech production using a recently developed lesion-symptom mapping method (voxel-based lesion-symptom mapping, VLSM) with 50 aphasic patients with left-hemisphere lesions. Conversational speech was collected through a standardized biographical interview, and used to determine mean length of utterance in morphemes (MLU), type token ratio (TTR) and overall token...
Acoustic Model Training with Detecting Transcription Errors in the Training Data
As the target of Automatic Speech Recognition (ASR) has moved from clean read speech to spontaneous conversational speech, we need to prepare orthographic transcripts of spontaneous conversational speech to train acoustic models (AMs). However, it is expensive and slow to manually transcribe such speech word by word. We propose a framework to train an AM based on easy-to-make rough transcripts ...
Intention Extraction from Text Messages
Identifying intentions of users plays a crucial role in providing better user services, such as web-search and automated message-handling. There is a significant literature on extracting speakers’ intentions and speech acts from spoken words, and this paper proposes a novel approach on extracting intentions from non-spoken words, such as web-search query texts, and text messages. Unlike spoken ...